{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Ch `11`: Concept `02`\n",
    "\n",
    "## Embedding Lookup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import TensorFlow, and begin an interactive session"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "sess = tf.InteractiveSession()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's say we only have 4 words in our vocabulary: *\"the\"*, *\"fight\"*, *\"wind\"*, and *\"like\"*.\n",
    "\n",
    "Maybe each word is associated with numbers.\n",
    "\n",
    "| Word   | Number | \n",
    "| ------ |:------:|\n",
    "| *'the'*    | 17     |\n",
    "| *'fight'*  | 22     |\n",
    "| *'wind'*   | 35     |  \n",
    "| *'like'*   | 51     |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "embeddings_0d = tf.constant([17,22,35,51])\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or maybe, they're associated with one-hot vectors.\n",
    "\n",
    "| Word   | Vector | \n",
    "| ------ |:------:|\n",
    "| *'the '*   | [1, 0, 0, 0]     |\n",
    "| *'fight'*  | [0, 1, 0, 0]     |\n",
    "| *'wind'*   | [0, 0, 1, 0]     |  \n",
    "| *'like'*   | [0, 0, 0, 1]     |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "embeddings_4d = tf.constant([[1, 0, 0, 0],\n",
    "                             [0, 1, 0, 0],\n",
    "                             [0, 0, 1, 0],\n",
    "                             [0, 0, 0, 1]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This may sound over the top, but you can have any tensor you want, not just numbers or vectors.\n",
    "\n",
    "| Word   | Tensor | \n",
    "| ------ |:------:|\n",
    "| *'the '*   | [[1, 0] , [0, 0]]    |\n",
    "| *'fight'*  | [[0, 1] , [0, 0]]     |\n",
    "| *'wind'*   | [[0, 0] , [1, 0]]     |  \n",
    "| *'like'*   | [[0, 0] , [0, 1]]     |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "embeddings_2x2d = tf.constant([[[1, 0], [0, 0]],\n",
    "                               [[0, 1], [0, 0]],\n",
    "                               [[0, 0], [1, 0]],\n",
    "                               [[0, 0], [0, 1]]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's say we want to find the embeddings for the sentence, \"fight the wind\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "ids = tf.constant([1, 0, 2])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use the `embedding_lookup` function provided by TensorFlow:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[22 17 35]\n"
     ]
    }
   ],
   "source": [
    "lookup_0d = sess.run(tf.nn.embedding_lookup(embeddings_0d, ids))\n",
    "print(lookup_0d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[0 1 0 0]\n",
      " [1 0 0 0]\n",
      " [0 0 1 0]]\n"
     ]
    }
   ],
   "source": [
    "lookup_4d = sess.run(tf.nn.embedding_lookup(embeddings_4d, ids))\n",
    "print(lookup_4d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[[0 1]\n",
      "  [0 0]]\n",
      "\n",
      " [[1 0]\n",
      "  [0 0]]\n",
      "\n",
      " [[0 0]\n",
      "  [1 0]]]\n"
     ]
    }
   ],
   "source": [
    "lookup_2x2d = sess.run(tf.nn.embedding_lookup(embeddings_2x2d, ids))\n",
    "print(lookup_2x2d)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
